<!DOCTYPE html>
<html lang="en">
  <head>
    
<meta charset="UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1"/>


<meta http-equiv="Cache-Control" content="no-transform" />
<meta http-equiv="Cache-Control" content="no-siteapp" />

<meta name="theme-color" content="#f8f5ec" />
<meta name="msapplication-navbutton-color" content="#f8f5ec">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="#f8f5ec">



  <meta name="description" content="Python str decode---error"/>




  <meta name="keywords" content="python, encoding, 八一" />



  <meta name="baidu-site-verification" content="HhUstaSjr0" />



  <meta name="google-site-verification" content="UA-102975942-1" />






  <link rel="alternate" href="/atom.xml" title="八一">




  <link rel="shortcut icon" type="image/x-icon" href="/favicon.ico?v=2.6.0" />



<link rel="canonical" href="https://bay1.top/2017/04/23/Python-str-decode-error/"/>


<link rel="stylesheet" type="text/css" href="/css/style.css?v=2.6.0" />
<link rel="stylesheet" type="text/css" href="/css/prettify.css" media="screen" />
<link rel="stylesheet" type="text/css" href="/css/sons-of-obsidian.css" media="screen" />



  <link rel="stylesheet" type="text/css" href="/lib/fancybox/jquery.fancybox.css" />




  
  <script id="baidu_analytics">
    var _hmt = _hmt || [];
    (function() {
      var hm = document.createElement("script");
      hm.src = "https://hm.baidu.com/hm.js?9a885cc9fb6cd7bcef579deb8efe8a70";
      var s = document.getElementsByTagName("script")[0];
      s.parentNode.insertBefore(hm, s);
    })();
  </script>



  <script id="google_analytics">
    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
        (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
        m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
        })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

        ga('create', 'UA-102975942-1', 'auto');
        ga('send', 'pageview');
  </script>










    <title> Python str decode---error - 八一 </title>
  </head>

  <body><div id="mobile-navbar" class="mobile-navbar">
  <div class="mobile-header-logo">
    <a href="/." class="logo">八一</a>
  </div>
  <div class="mobile-navbar-icon">
    <span></span>
    <span></span>
    <span></span>
  </div>
</div>

<nav id="mobile-menu" class="mobile-menu slideout-menu">
  <ul class="mobile-menu-list">
    
      <a href="/archives">
        <li class="mobile-menu-item">
          
          
            Articles
          
        </li>
      </a>
    
      <a href="/tags">
        <li class="mobile-menu-item">
          
          
            Tags
          
        </li>
      </a>
    
      <a href="/about">
        <li class="mobile-menu-item">
          
          
            About / Links
          
        </li>
      </a>
    
      <a href="/search">
        <li class="mobile-menu-item">
          
          
            Site search
          
        </li>
      </a>
    
  </ul>
</nav>

    <div class="container" id="mobile-panel">
      <header id="header" class="header"><div class="logo-wrapper">
  <a href="/." class="logo">八一</a>
</div>

<nav class="site-navbar">
  
    <ul id="menu" class="menu">
      
        <li class="menu-item">
          <a class="menu-item-link" href="/archives">
            
            
              Articles
            
          </a>
        </li>
      
        <li class="menu-item">
          <a class="menu-item-link" href="/tags">
            
            
              Tags
            
          </a>
        </li>
      
        <li class="menu-item">
          <a class="menu-item-link" href="/about">
            
            
              About / Links
            
          </a>
        </li>
      
        <li class="menu-item">
          <a class="menu-item-link" href="/search">
            
            
              Site search
            
          </a>
        </li>
      
    </ul>
  
</nav>

      </header>

      <main id="main" class="main">
        <div class="content-wrapper">
          <div id="content" class="content">
            
  
  <article class="post">
    <header class="post-header">
      <h1 class="post-title">
        
          Python str decode---error
        
      </h1>

      <div class="post-meta">
        <span class="post-time">
          2017-04-23
        </span>
        
        
        
      </div>
    </header>

    
    

    <div class="post-content">
      
        <p>I ran into this pitfall recently while solving a "fast" challenge on a similar platform: Python, Chinese text, and strings. <a id="more"></a></p>
<blockquote>
<p>Author's environment: Python 3, IDE: PyCharm. These are my own notes; corrections are welcome.</p>
</blockquote>
<p>First, the challenge itself. Below is the script it needs (the regex part was written by 道长, not me; it is simple, but I have not learned regex yet).</p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># coding:utf-8</span></span><br><span class="line">import requests</span><br><span class="line">import base64</span><br><span class="line">import re</span><br><span class="line">url='http://c.bugku.com/web6/'</span><br><span class="line"><span class="comment"># A session keeps cookies across the GET and the POST.</span></span><br><span class="line">a=requests.session()</span><br><span class="line">r=a.get(url)</span><br><span class="line"><span class="comment"># The 'flag' response header holds a Base64-encoded message.</span></span><br><span class="line">FLAG=r.headers['flag']</span><br><span class="line"><span class="comment"># b64decode returns bytes; decode to str before matching with a str pattern.</span></span><br><span class="line">p=re.match('(.*)(: )(.*)',(base64.b64decode(FLAG)).decode())</span><br><span class="line"><span class="comment"># Group 3 is itself Base64; decode it again and post it back as 'margin'.</span></span><br><span class="line">payload=&#123;'margin':base64.b64decode(p.group(3))&#125;</span><br><span class="line">r=a.post(url,data=payload)</span><br><span class="line">print(r.text)</span><br></pre></td></tr></table></figure>
<p>Originally, though, one part of the code looked like this:<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">p=re.match(<span class="string">'(.*)(: )(.*)'</span>,(base64.b64decode(FLAG)))</span><br></pre></td></tr></table></figure></p>
<p>It failed with this error:<br><img src="https://s1.ax1x.com/2018/01/01/pSfsyj.png" alt="str error-1"></p>
<blockquote>
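<p>After asking my team lead, I understood that this comes down to how Python handles strings.<br>After Base64 decoding, FLAG has this content: <span style="color: red;">跑的还不错，给你flag吧: MjY4NTMy</span> (roughly "You ran pretty well, here is your flag: MjY4NTMy")<br>PyCharm prints it like this: <span style="color: red;">b'\xe8\xb7\x91\xe7\x9a\x84\xe8\xbf\x98\xe4\xb8\x8d\xe9\x94\x99\xef\xbc\x8c\xe7\xbb\x99\xe4\xbd\xa0flag\xe5\x90\xa7: MjY4NTMy'</span><br>The result is a bytes object, and a str regex pattern cannot be matched against bytes, so it has to be decoded into a str (Python's internal Unicode text) before the regex can work on it.</p>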
</blockquote>
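<p>To make the failure and the fix concrete, here is a minimal offline sketch. It assumes the error in the screenshot is the usual TypeError raised when a str pattern meets bytes, and it rebuilds the header value from the decoded text quoted above instead of contacting the server; the names raw, token, token_b and margin are only illustrative.</p>

```python
import base64
import re

# Sample header value rebuilt from the decoded text quoted in the post,
# so the sketch runs offline (the real script reads r.headers['flag']).
FLAG = base64.b64encode("跑的还不错，给你flag吧: MjY4NTMy".encode("utf-8"))

raw = base64.b64decode(FLAG)           # a bytes object, not a str

try:
    re.match('(.*)(: )(.*)', raw)      # str pattern applied to bytes
except TypeError as exc:
    error = str(exc)

# Fix 1: decode the bytes to str, then match with the str pattern.
p = re.match('(.*)(: )(.*)', raw.decode('utf-8'))
token = p.group(3)

# Fix 2: keep the data as bytes and use a bytes pattern instead.
pb = re.match(b'(.*)(: )(.*)', raw)
token_b = pb.group(3)

# The captured token is Base64 again; this is the value posted as 'margin'.
margin = base64.b64decode(token)

print(error)
print(token, token_b, margin)
```

<p>Either fix works here. Decoding first is the more convenient choice when the text is needed as a str afterwards, while a bytes pattern avoids having to know the encoding of the non-ASCII prefix.</p>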
<p>As for the cause of the error, the explanation I found most often online is that the program itself is fine and the IDE is to blame:<br><span style="color: red;">in some IDEs, string output comes out garbled or even raises errors, because the IDE's output console cannot display the string's encoding</span><br>That accounts for garbled console output, but the error above most likely has a simpler cause: re.match was given a str pattern and a bytes value.<br>Python's default encoding can be checked with the following code (mine reports utf-8; this is the interpreter's default, not necessarily the console's):<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment">#!/usr/bin/env python</span></span><br><span class="line"><span class="comment">#coding=utf-8</span></span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line">print(sys.getdefaultencoding())</span><br></pre></td></tr></table></figure></p>
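<p>Since garbled console output depends on more than one setting, it can help to look at several encodings side by side. This is a small sketch using only standard-library calls; the dictionary name encodings is just for illustration, and every value except the default can differ from machine to machine.</p>

```python
import locale
import sys

# Collect the encodings that commonly matter when text looks garbled.
encodings = {
    "default": sys.getdefaultencoding(),            # used by str.encode()/bytes.decode()
    "stdout": sys.stdout.encoding,                  # what the console or IDE pane expects
    "filesystem": sys.getfilesystemencoding(),      # used for file names
    "locale": locale.getpreferredencoding(False),   # default for open() in text mode
}

for name, value in encodings.items():
    print(name, value)
```

<p>If the stdout entry is not a Unicode-capable encoding, printing Chinese text can fail or look wrong even though the program's strings are fine.</p>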
<p><strong>I also took this opportunity to review how Python handles string encoding.</strong></p>
<blockquote>
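<p>Because Python represents text internally as Unicode, converting between encodings usually goes through Unicode as the intermediate form.<br>That is, first decode the bytes from the source encoding into Unicode, then encode that Unicode text into the target encoding.<br>The code below shows the pattern.</p>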
</blockquote>
<pre><code class="python"><span class="comment"># test must be a bytes object holding GB2312-encoded data</span>
test=test.decode(<span class="string">'gb2312'</span>).encode(<span class="string">'utf-8'</span>)
</code></pre>
<p>That is, decode first turns the GB2312-encoded test into Unicode, and encode then turns that into UTF-8.</p>
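<p>The conversion can be tried end to end with a sample value. The sketch below assumes Python 3, where the input must be bytes; the sample text and variable names are made up for illustration.</p>

```python
# Start from a str and produce GB2312 bytes to play the role of `test`.
text = "中文"
gb_bytes = text.encode("gb2312")

# GB2312 bytes -> Unicode str -> UTF-8 bytes.
utf8_bytes = gb_bytes.decode("gb2312").encode("utf-8")

# Decoding with the wrong codec fails instead of silently converting.
try:
    gb_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

print(gb_bytes, utf8_bytes, wrong_codec_failed)
```

<p>The same characters occupy two bytes each in GB2312 and three bytes each in UTF-8, which is why the byte strings differ even though the text is identical.</p>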
<blockquote>
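<p>The decode() we used was called without arguments. In Python 3, bytes.decode() defaults to UTF-8, so it is equivalent to decode('utf-8').<br>The implicit conversion described in many articles belongs to Python 2, where calling encode() on a byte string behaved like str.decode(sys.getdefaultencoding()).encode(); sys.getdefaultencoding() is the function shown above.</p>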
</blockquote>
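<p>A short check of the default, assuming Python 3. It also shows the errors parameter of bytes.decode(), which the post does not use but which decides what happens when the bytes are not valid in the chosen encoding; the truncated sample is made up for illustration.</p>

```python
data = "驹".encode("utf-8")

# With no arguments, decode() uses UTF-8 and strict error handling.
same_result = data.decode() == data.decode("utf-8")

# A truncated multi-byte sequence is invalid UTF-8.
truncated = data[:2]
try:
    truncated.decode()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD for the undecodable bytes instead.
replaced = truncated.decode("utf-8", errors="replace")

print(same_result, strict_failed, replaced)
```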
<ul>
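<li><strong>A summary from a more experienced developer</strong><blockquote>
<p>In Python 3, str (unicode in Python 2) is Unicode text with no concrete byte form, so it can only be encoded into one:<br>it has encode but no decode. Likewise, a Python 3 bytes object such as b'\xe9\xa9\xb9' (the UTF-8 encoding of the character '驹')<br>already has a concrete encoding and can only be decoded back into Unicode, so it has decode only; calling encode on it raises an error.</p>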
</blockquote>
</li>
</ul>
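<p>The summary above can be checked directly in Python 3. This is a minimal sketch; the variable names are illustrative.</p>

```python
text = "驹"
data = text.encode("utf-8")

# str only offers encode(); bytes only offers decode().
str_has_decode = hasattr(text, "decode")
bytes_has_encode = hasattr(data, "encode")

# Calling the missing method raises AttributeError.
try:
    data.encode("utf-8")
    raised = False
except AttributeError:
    raised = True

print(data, str_has_decode, bytes_has_encode, raised)
```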
<p><span style="color: red;">String encoding really is a headache...</span></p>
<p>Links on string encoding: <a href="http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html" target="_blank" rel="noopener">Ruan Yifeng's notes on character encoding</a><br><a href="http://ajucs.com/2015/11/10/Python-character-encoding-explained.html" target="_blank" rel="noopener">Character encoding and Chinese text handling in Python, explained</a></p>

      
    </div>

    
      
      



      
      
    

    
      <footer class="post-footer">
        
          <div class="post-tags">
            
              <a href="/tags/python/">python</a>
            
              <a href="/tags/编码/">encoding</a>
            
          </div>
        
        
        
  <nav class="post-nav">
    
      <a class="prev" href="/2017/04/29/天朝挖煤的题已经不会做了。。/">
        <i class="iconfont icon-left"></i>
        <span class="prev-text nav-default">天朝挖煤的题已经不会做了。。</span>
        <span class="prev-text nav-mobile">Previous</span>
      </a>
    
    
      <a class="next" href="/2017/04/22/期中总结/">
        <span class="next-text nav-default">期中总结</span>
        <span class="prev-text nav-mobile">Next</span>
        <i class="iconfont icon-right"></i>
      </a>
    
  </nav>

      </footer>
    

  </article>


          </div>
          
  <div class="comments" id="comments">
      <div id="disqus_thread">
        <noscript>
          Please enable JavaScript to view the
          <a href="//disqus.com/?ref_noscript">comments powered by Disqus.</a>
        </noscript>
      </div> 
    </div>
  </div>


        </div>
      </main>

      <footer id="footer" class="footer">

  <div class="social-links">
    
      
        
          <a href="https://github.com/bay1" class="iconfont icon-github" title="github"></a>
        
      
    
      
        
          <a href="http://weibo.com/3190704711/profile?topnav=1&wvr=6&is_all=1" class="iconfont icon-weibo" title="weibo"></a>
        
      
    
      
    
      
    
      
    
    
    
  </div>


<div class="copyright">
  <span class="copyright-year">
    
    &copy; 
     
      2016 - 
    
    2018
    <span class="author">bay1</span>
  </span>
</div>
      </footer>

      <div class="back-to-top" id="back-to-top">
        <i class="iconfont icon-up"></i>
      </div>
    </div>

    
  
  <script type="text/javascript">
    var disqus_config = function () {
        this.page.url = 'https://bay1.top/2017/04/23/Python-str-decode-error/';
        this.page.identifier = '2017/04/23/Python-str-decode-error/';
        this.page.title = 'Python str decode---error';
    };
    (function() {
    var d = document, s = d.createElement('script');

    s.src = '//https-blog-flywinky-top-1.disqus.com/embed.js';

    s.setAttribute('data-timestamp', +new Date());
    (d.head || d.body).appendChild(s);
    })();  
  </script>



    
  





  
    <script type="text/javascript" src="/lib/jquery/jquery-3.1.1.min.js"></script>
  

  
    <script type="text/javascript" src="/lib/slideout/slideout.js"></script>
  

  
    <script type="text/javascript" src="/lib/fancybox/jquery.fancybox.pack.js"></script>
  


    <script type="text/javascript" src="/js/src/even.js?v=2.6.0"></script>
<script type="text/javascript" src="/js/src/bootstrap.js?v=2.6.0"></script>
<script src="/js/prettify.js"></script>
<script type="text/javascript">
$(document).ready(function(){
 $('pre').addClass('prettyprint');
   prettyPrint();
 })
</script>
  </body>
</html>
